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Significant progress has been made in the development of subgrid scale (SGS) closures based 
on a filtered density function (FDF) for large eddy simulations (LES) of turbulent reacting flows. 
The FDF is the counterpart of the probability density function (PDF) method, which has proven 
effective in Reynolds averaged simulations (RAS). Flowever, while systematic progress is being 
made advancing the FDF models for relatively simple flows and lab-scale flames, the application 
of these methods in complex geometries and high speed, wall-bounded flows with shocks remains a 
challenge. The key difficulties are the significant computational cost associated with solving the 
FDF transport equation and numerically stiff finite rate chemistry. For LES/FDF methods to 
make a more significant impact in practical applications a pragmatic approach must be taken that 
significantly reduces the computational cost while maintaining high modeling fidelity. An example 
of one such ongoing effort is at the NASA Langley Research Center, where the first generation 
FDF models, namely the scalar filtered mass density function (SFMDF) are being implemented 
into VULCAN, a production-quality RAS and LES solver widely used for design of high speed 
propulsion flowpaths. This effort leverages internal and external collaborations to reduce the 
overall computational cost of high fidelity simulations in VULCAN by: implementing the high 
order methods that allow reduction in the total number of computational cells without loss in 
accuracy; implementing first generation of high fidelity scalar PDF/FDF models applicable to high- 
speed compressible flows; coupling RAS/PDF and LES/FDF into a hybrid framework to efficiently 
and accurately model the effects of combustion in the vicinity of the walls; developing efficient 
Lagrangian particle tracking algorithms to support robust solutions of the FDF equations for 
high speed flows; and utilizing finite rate chemistry parametrization, such as flamelet models, to 
reduce the number of transported reactive species and remove numerical stiffness. This paper 
briefly introduces the SFMDF model (highlighting key benefits and challenges), and discusses 
particle tracking for flows with shocks, the hybrid coupled RAS/PDF and LES/FDF model, flamelet 
generated manifolds (FGM) model, and the Irregularly Portioned Lagrangian Monte Carlo Finite 
Difference (IPLMCFD) methodology for scalable simulation of high-speed reacting compressible 
flows. 


* Research Aerospace Engineer, Hypersonic Airbreathing Propulstion Branch, AIAA Senior Member. 

' Graduate Student, Department of Mechanical and Aerospace Engineering, AIAA Student Member. 

^Graduate Student, Mechanical Engineering and Materials Science, AIAA Student Member. 

§ Research Assistant Professor, Center for Simulation and Modeling, AIAA Senior Member. 


1 of 17 


American Institute of Aeronautics and Astronautics 


I. Introduction 


Recent flight demonstrations of hypersonic air-breathing vehicles 1,2 prove their increasing promise for military 
(rapid response and strike capability on global scale), aerospace (safer and more affordable access to space), and 
civil aviation (hypersonic transport) applications. Currently, these technologies are in their early development stages 
with commercial interest and internal research and development investment at a small fraction of that of government 
organizations such as National Aeronautics and Space Administration (NASA) and Air Force Research Laboratories 
(AFRL). To advance hypersonic air-breathing technologies to the technology readiness level (TRL) necessary for 
wide spread commercialization, further government investments are needed over the next decade and likely beyond. 
However, designing devices capable of robust hypersonic air-breathing operation, characterized by rapid mixing and 
short combustion times while ensuring flame stability, has proven difficult because the inherent high flow through 
velocities currently challenge both experimental and computational capabilities. In particular, current experimental 
facilities are sub-scale allowing for lx-lOx mass flow rates (with lx equal to 10 lbm/sec of air flow) where lOOx 
scale ups are needed. Flow diagnostics are typically limited to wall pressures and temperatures with non-intrusive, 
multi-dimensional, flow field information needed to advance knowledge and modeling. The current computational 
capabilities have shortcomings in both numerical methods (e.g. high order shock capturing) and physical models (e.g. 
compressible turbulence, turbulence-chemistry interaction, finite-rate kinetics) appropriate to these high velocity flow 
regimes. 

Internal high-speed flows, such as those typically found in scramjet combustors, pose a particular challenge for 
computational fluid dynamics (CFD) because of complex, non-linear interactions between viscous wall-bounded flows, 
mixing layers, shocks, and thermodynamic and chemical non-equilibrium flow physics. More specifically, scramjet 
combustors are required to operate near optimum efficiency with minimal aerodynamic loss in flow regimes where 
flow and chemical time scales are of the same order. Therefore, the interactions of turbulence and finite-rate chemical 
kinetic effects will play a leading role in determining flow and flame characteristics and must be modeled with high 
fidelity. Such similarity of scale can also cause the flow regime to be highly unstable and susceptible to flame extinction 
and re-ignition phenomena which can have a detrimental impact on overall combustor operation and lead to a loss of 
thrust, narrowing of operability margins, and engine unstart. Unfortunately, due to these multi-scale, multi-physics 
interactions, this flow regime is also extremely challenging to model because no assumptions can be accurately 
postulated about the relationships between time and length scales of the multi-physics phenomena. Therefore, the 
dynamic interactions of multi-physics, multi-scale phenomena must be simulated rather than modeled. 

The need for predictive simulation of multi-scale flows has led to the development of large eddy simulation (LES). 
In LES, the turbulence is separated into large-scale, grid-resolved, and geometry-influenced flow motions, and small- 
scale, unresolved but universally-behaving modeled motions. 2 3 This “divide-and-conquer” approach utilizes brute force 
computing to resolve the most challenging dynamical interactions of the flow and relies on “sub-grid” models for 
the more universal small-scale flows. Because of this, LES is inherently unsteady and computationally expensive. 
Nevertheless, significant advancements in computational hardware, numerical algorithms, and physics models over 
the last two decades have increased our ability to employ LES for the unsteady simulation of reacting flows in practical 
devices. 4,5 LES also offers a substantial computational cost reduction over direct numerical simulation (DNS). This 
cost reduction is proportional to the cube power of the sub-grid Reynolds number. Re a - 6 The value of the sub-grid 
Reynolds number for a typical LES of practical interest varies from 10-100 (or more) depending on the characteristic 
flow Reynolds number. This represents a dramatic three -to-six orders of magnitude computational cost reduction that 
makes LES of practical systems possible and the corresponding DNS difficult to imagine in the foreseeable future. 
Fortunately, numerical algorithm requirements for both LES and DNS are the same; therefore any tools developed for 
LES are directly applicable for DNS. The latter can then be used to elucidate physical interactions in computationally 
accessible canonical problems 7-10 which can further lead to the development of improved physics models. 

Not all physical phenomena, however, bear strong functional dependence on the LES-resolved multi-scale dynam- 
ics. While the LES predictions of geometry -governed mixing processes, such as swirling, recirculation, or vortex 
breakdown, are greatly improved over those using Reynolds averaged simulations (RAS), the prediction of turbulent- 
reacting flows remains a challenge. This is because combustion phenomena in both premixed and non-premixed 
systems takes place when the reactants experience transport at the molecular level (i.e. sub-grid). For example, in 
non-premixed systems, the LES-resolved processes only serve to “wrinkle and strain” the interface between the re- 
actants thereby increasing the contact surface area. While accurate prediction of the total “contact” area is important 
for combustion, only the action of molecular diffusion mixes the reactants in a way that enables chemical reactions to 
occur. Similarly, for premixed combustion, the turbulent stirring processes only “move around” cold premixed pockets 
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of fuel and oxidizer, whereas it is the molecular thermal diffusion (diffusion of heat away from the reaction zone and 
into the unburned pockets of the mixture) that increases the temperature of the premixed reactants to levels required for 
self-sustained chemical reactions. As suggested above, both molecular and thermal diffusion are transport processes 
that directly influence chemical reactions. But since they occur at the smallest turbulent length scales (dissipation 
scales), they are unresolved in LES. This implies that the advantages of LES over RAS do not fully apply in turbulent 
reacting flows. 

The fact that the benefits of LES over RAS cannot be effectively realized in reacting flows is disappointing but 
should not be discouraging. This is because over the past forty years researchers have been actively developing 
turbulence-chemistry interaction models for reacting RAS. The successes achieved by RAS modelers provide confi- 
dence in continued use of RAS in design and justify further development of models. Furthermore, given that challenges 
associated with turbulence-chemistry interaction modeling are comparable between LES and RAS, the experience 
gained during RAS development can be directly leveraged to develop corresponding models for LES. This strategy 
has been effectively used during the last twenty years. 

The majority of current models for turbulence-chemistry interaction for LES (and RAS) are based on statistical 
approaches. A few exceptions are models based on deconvolution, 1 13 fractal dynamics, 1415 and stochastic descrip- 
tions of the structure of turbulence dynamics. 16,17 Reviews of turbulent combustion modeling for LES of low-speed 
and high-speed flows are provided by Pitsch, 18 and Drummond, 19 respectively. The commonly used statistical models 
define a joint probability density function (PDF) of density, reacting scalars, temperature, pressure (for high-speed 
flows), and certain turbulence quantities such as turbulence mixing frequency. The primary benefit of PDF-based 
models is that the averaged, or filtered, chemical reaction terms appear in a closed form. This is not a trivial accom- 
plishment because, as argued in the previous paragraph, the chemical reactions are unresolved by LES. Furthermore, 
the closed-form model for the chemical reactions allows for dynamic coupling between the sub-grid and resolved 
processes thereby promising significant improvements in predictive fidelity. The simplest of PDF models assume the 
shape and nature of statistical relationships (e.g. statistical independence) among the state variables. 20,21 Higher fi- 
delity models, however, rely on solving transport equations governing those shapes and statistical relationships. 22-31 
This class of models, termed filtered density function (FDF) to distinguish them from PDF-based models used in RAS, 
has proven effective for LES of reacting flows. 5,28 ' 32,33 In this paper, the Lagrangian version of the joint scalar (com- 
position) FDF (SFDF) is considered for applications in high speed flows. While the understanding and acceptance of 
the FDF methods for combustion simulations have been growing and versions of the models have been implemented 
in commercial software products, their use for cases of practical interest has been limited. This is mainly due to 
computational costs associated with solving the FDF transport equations in complex three-dimensional (3D) flow con- 
figurations that are now commonly utilized for design. Therefore, just as improvements to algorithms for solutions of 
Navier-Stokes equations greatly contributed to reduction of overall computational cost, improvements to algorithms 
for solutions of the FDF are expected to have a similar impact. 

The computational costs of turbulence-chemistry interaction models are further increased by the numerical cost 
of evaluating of detailed, physics-based molecular diffusion and finite-rate chemical kinetics models. 34 The former 
describes the molecular diffusion phenomena which, as discussed above, are paramount to (sub-grid) molecular 
mixing processes leading to sustained chemical reactions. The latter describes rates of conversion processes that 
recombine fuel and oxidizer species into combustion products. This paper focuses on techniques used to reduce 
the computational cost of finite-rate chemical kinetics models with four dimensional (three spatial dimensions and 
time) simulations in mind. This is because it is widely recognized that the numerical computations of chemical 
reactions represent a large fraction of total simulation cost. In general, each species component deemed important 
for accuracy of a reacting simulation requires the solution of a transport equation. Even for simple fuels, such as 
hydrogen, 35 this requirement alone can result in a large number of transport equations as compared to non-reacting 
flows. However, it is not only the increase in the total number of transport equations that poses a great challenge. 
The number of rate expressions required for evaluation of the species’ chemical source terms is typically about five 
times that of the number of species. 34 Additionally, those rate expressions often involve complicated, non-linear 
functions or algebraic expressions that include exponential, products of scalar variables raised to non-integer powers, 
and contain a disparate range of time scales resulting in stiffness. The stiffness, in particular, poses a challenge for 
numerical time integration algorithms. Combined, these challenges are so great that some of the most accurate detailed 
chemical kinetics mechanisms that include pollutant formation details (e.g. NOx chemistry) , containing thousands of 
species and tens-of-thousands of rate coefficients, can only be used within zero (e.g. perfectly stirred reactor) or one- 
dimensional (e.g. opposed flow flames) simulations. However, attempts at addressing computational issues related 
to efficiently solving detailed reacting systems may be in vain if the accuracy of the chemical kinetics mechanism to 
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predict key combustion phenomena, such as ignition delay time and flame propagation speed, is not improved. This is 
especially important in high speed propulsion applications where optimal combustion efficiency is often achieved in 
combustion regimes at low-to-moderate pressures (0.5-5 atm) and, because of high flow velocities, the combustion is 
increasingly sensitive to both ignition delay time and flame propagation speed. 

Given the availability of accurate chemical kinetics mechanisms, the numerical costs associated with integrating a 
large number of stiff transport equations containing complicated source terms can be lowered by reducing the number 
of transport equations, accelerating rate calculations, and removing stiffness. A detailed discussion of current state- 
of-the-art is presented in Lu and Law. 34 One promising technique that offers to address the three challenges outlined 
above is the flamelet generated manifold (FGM) model. 36 This model considers a tabulated chemistry parametrization 
technique that reduces a total number of species to a much smaller number of important progress variables (for which 
transport equations must be solved and which parametrize all other species). Simultaneously, it is possible to pre- 
compute and pre-integrate the chemical source terms as a function of progress variables to avoid the costly on-the-fly 
reaction rate evaluations and stiff integration. Theoretically, the accuracy of the model approaches that of the detailed 
chemical kinetic scheme from which it is derived in the limit of number of progress variables approaching the number 
of species. The numerical implementations of FGM must include construction of an efficient table lookup/interpolation 
algorithm. The main benefit of tabulated chemistry frameworks is the flexibility to incorporate practical experience 
about important processes and parameters into the definitions of the parametrization and weigh fidelity (i.e. number of 
progress variables) against computational cost. When detailed chemical kinetic predictions are beyond computational 
feasibility, both of these elements allow for a pragmatic approach to simulations of reacting flows that offer major 
improvements over the current state-of-the-art. 

II. Compressible FDF formulation for first generation models 

The following section describes the mathematical formulation of the first generation of FDF-based high fidelity 
models for the turbulence-chemistry interactions in LES of high speed flows. The current formulation is based on the 
mass-weighted SFDF, termed the joint scalar filtered mass density function (SFMDF), developed for LES of low-speed 
reacting flows by Jaberi et al. 37 The SFMDF is further modified to account for the compressibility effects following 
the RAS/PDF work of Hsu et al. 38 


A. Governing Equations 

Implementation of LES involves the use of the spatial filtering operation 3 

/ +oo 

Q{x!,t)G{x' - x)dx' (1) 

-oo 

where Q denotes the temporally invariant, localized, symmetric, and positive filter kernel of width A/ , and (Q(x, t)) e 
represents the filtered value of the transport variable Q(x, t). In variable density flows it is convenient to consider the 
Favre filtered quantity, (Q(x,t)) L = {pQ) e /{p}^- 

The flow field to be simulated is unsteady, three-dimensional (3D), and involves gaseous (single-phase) combus- 
tion. Newton’s law of viscosity, Fourier’s law of heat conduction and Fick’s law of mass diffusion are employed. The 
caveats in the use of these laws in reacting flows are recognized. 39 The primary transport variables are the fluid density 
p , the velocity vector iq , i = 1,2,3 along the x t direction and at a time t, the pressure p, the mass fractions of n s 
species, Y a (a = 1,2,..., n s ), and the sensible enthalpy h s . 

The transport variables satisfy the conservation equations of mass, momentum, species’ mass fractions, and 
enthalpy. The filtered form of these equations is: 

9(p)t d(p) e (u z ) L = 
dt dxi 

d(p)e( u j) L , 9{p) e (uj) L (uj) L = d(p) e 
dt d Xi dxj 

9{p)Mq)l + 9{p) t {uj) L {K) L = _ 

dt dxi dxi 


@Tij 

dxi dxi 
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dM a 
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where the scalar fields are denoted by </> a = Y n , a = 1, . . . ,n s , and S a is the production rate of species a. For 
a = n s + 1, 4 > n s + 1 = h s = h%Y a , and S nB +i = S r + S c , where the superscripts r and c denote contributions 

from the heat release due to combustion, and compressibility effects and viscous dissipation, respectively. These 
source terms can be expanded as follows: S r = — A h° a S a , where A h° a is the enthalpy of formation of species 

a, and S c = ^ + T ik§^J- where V /Vt denotes the material derivative, and r, ? is the viscous stress tensor. 

Equations (2, 3, and 4) are closed by the constitutive relation, 40 


(P)t 


I L g 

a=l 



(5) 


where 1Z is the Universal Gas Constant, W a is the molecular weight of species a and T is the temperature. t 13 and ./“ 
denote the viscous stress tensor and the scalar fluxes, respectively. 


( T ij)e 



9{uj) L 

dXi 


2 d(u k ) L 
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(p)e {D)l 


9(<p a ) L 

dxi 


(7) 


where /i is the molecular coefficient of viscosity, and D is the molecular diffusion coefficient. The current formulation 
assumes equal diffusion among the species and heat, i.e. Le a = Le = 1. This is reasonable because errors associated 
with not modeling preferential diffusion effects in high Reynolds number turbulent reacting flows are expected to be 
small. 41 

The SGS closure problem is associated with = (p) ^{(uiUj) L — (ui) L (uj) L ) and M“ = (p) e ((ui(f> a ) L — 
{ui) L ((/) a ) L ), respectively denoting the SGS stresses and SGS fluxes. These closures are based on models well 
established in non-reacting flows. 3 42-44 In this work, algebraic closures based on eddy viscosity and eddy diffusivity 
will be utilized. In compressible reacting flows, additional models are required for the filtered compressibility and 
viscous dissipation terms, i.e. ( pS c ) e , and filtered reaction rates of the mass fractions, i.e. (pS a ) e , a = 1, . . . , n s . 
The models for the former have presented a challenge and only few have been in use in RAS with mixed results. 43 - 44 
The models for the latter are provided by the FDF methods. 


B. Scalar filtered mass density function 

The SFMDF, denoted by F f , is formally defined as 


Ft (</>; x, t) 


r» + 00 


P (x', t) CO, <Mx', t))£(x' - x) dx' 


( 8 ) 


n s 

CO) 0(x, f)) = <50 - 0(x, t)) = £Oa - (x, t)) (9) 

Oi= 1 

where <5 denotes the delta function and ip denotes the sample space of the scalar array. The term ( is the “fine-grained” 
density, 22-45 hence Eq. (8) defines the SFMDF as the mass weighted spatially filtered value of the fine-grained density 
function. With the filter properties specified in the previous section. Ft has all of the properties of a PDF. 22 The 
SFMDF transport equation can be obtained by applying the above definitions to the unfiltered form of scalar transport 
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equation, Eq. (4). The result, after some algebraic manipulation, is 
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Compressibility effects 


(10) 


The above transport equation for the SFMDF contains conditionally filtered terms that are unclosed. These terms 
are the resolved and SGS transport, micromixing, and the compressibility effects. The respective models for these 
terms utilize the conventional gradient diffusion, and the interaction by exchange with the mean (IEM) model. 46 In the 
first generation FDF models, the resolved pressure fluctuations are assumed dominant, and the compressibility effects 
are assumed negligible within the subgrid. This assumption allows us to remove the conditional filtering operations 
from both the chemical source terms and the compressible effects term. Recent FDF models of Nik et al., 31 based 
on previous work in RAS of Delarue and Pope, 47 consider a fully coupled velocity-scalar-pressure joint FDF for 
FES of compressible reacting flows thereby accounting for compressibility effects within the subgrid. With the above 
modeling assumptions the transport equation for the SFMDF becomes. 
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where D t is the subgrid diffusivity, Q m is the frequency of mixing within the subgrid, and the terms due to chemical 
reactions appear in a closed form. The above equation can be integrated to obtain transport equations for the scalar 
moments. The equations for the first Favre moment, {<p a ) L , and the generalized variance, <r^ = {<Pa) L ~ ( <pa)\, are: 
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C. Stochastic system 

The most convenient means of modeling and solving the FDF transport equation is via the stochastic differential equa- 
tions (SDEs) 48 ' 49 and the Fagrangian Monte Carlo (MC) procedure, 50 respectively. 22 24 The basis of this procedure 
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is the same as in other recent RAS 51-54 and LES. 31,33,55-58 Therefore, only some of the fundamental properties of 
the methodology will be described below. With the Lagrangian procedure, the FDF is represented by an ensemble of 
computational stochastic elements (or particles) which are transported in the physical space by the combined actions 
of convection and diffusion (molecular and turbulent). In addition, transport in the scalar composition space occurs 
due to chemical reaction, micromixing, and compressibility effects. These physical processes are described by the set 
of SDEs. The stochastic diffusion process is considered for this purpose, 

d.Xi(t) = m i (X{t),t)dt + T, ij {X(t),t)dW j (14) 

where X, is a vector of i = 1, ... ,n diffusion processes, rn, is the drift vector, E ?J - is the diffusion tensor, and Wj 
( j = 1, . . . , m) denotes the Wiener-Levy processes (random walk). The SDEs used here are 
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d<S>t 

#n.+l 


’’ i)L + We d{P)ei{ dlt + ^ ) dt+ \/ 2{{D)l + Dt) dWl 
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dt 
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where the x + and dd denote Lagrangian position and scalar value, respectively. The Fokker-Planck equation 
corresponding to Eqs. (15-17) is equivalent to Eq. (1 1). 22 Thus, the solution of these SDEs represents the single-point 
FDF in the probabilistic sense. The above formulation has two disadvantages: first, it is unable to model the effects of 
preferential species and heat diffusion, and second, it gives rise to a spurious production of scalar variance (i.e. second 
term on the right hand side of Eq. (13). Both of these issues have been addressed by the work of McDermott and 
Pope. 59 


D. Numerical Procedure 

The FDF is solved via a hybrid Eulerian and Lagrangian MC procedure. 51,53,60 For a numerical solution of the Eulerian 
field, structured or unstmctured, finite-difference or finite-volume (or other) approaches can be utilized. The coupling 
between the Eulerian solver and the Monte Carlo procedure is enacted via the source terms in the sensible enthalpy 
transport equation which is redundantly solved by the hybrid solver. 

The MC solver provides the Favre averaged scalar fields by considering a set of n p particles that evolve according 

(n) 

to the SDEs described by Eqs. (15-17). Each particle carries the information pertaining to its position x\ and scalar 

(n) 

value, (jy a for n = 1 . . . n p . The simplest way of performing the temporal integration of SDEs is via the Euler- 
Maruyamma approximation 49,50 

Xl n \t k+1 ) = X x {n \t k ) + m[ n \x(t k ),t k ) At + X%\x(t k ),t k ) At * $ n \t k ) (18) 

where ^ n \t k ) is an independent standardized Gaussian random variable. Additionally, the temporal integration is 
performed using a splitting technique which is necessary to facilitate integration of the stiff chemical source terms. 
That is, first, the scalar values of each particle are updated via a mixing model, and second, the chemical source 
terms are integrated using the updated scalar values as initial condition. This simple two-step process is shown to be 
first-order accurate, although higher order splitting approaches have also been recently developed. 61 To understand the 
operational procedures of the hybrid configuration, the elements of the computation are shown in Fig. 1. 

This figure shows the MC particles randomly distributed and freely moving within the computational domain. The 
particle transport is Lagrangian; thus the solution is free of constraints associated with typical convection on fixed grid 
points. Statistical information is obtained by considering an ensemble of n particles residing within each computational 
cell volume. The transfer of information from cells to the MC particles is accomplished via interpolations. 

To reduce the computational cost and maintain statistical accuracy in variable density flows, the MC particles enter 
the computational domain uniformly but carry a weight (uj) that is proportional to the density at the inflow. The Favre 
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Figure 1. Hybrid numerical procedure, (a) Two dimensional schematic of the elements involved in computation using a typical hybrid 
Eulerian-Lagrangian solver. Black boxes denote the computational cells, and circles the Lagrangian particles, respectively, (b) Three 
dimensional hybrid simulation illustrating Lagrangian particles being convected over Cartesian grid. The colors of the particles denote 
their scalar values. 


averaged value of a quantity Q(</>(x, t)) is then obtained by a weighted average 

E„ e( A)3 <*> (n) Q(<t> {n) ) 


O(0(x,f)) 


E 


nG( A ) 3 


j( n ) 


(19) 


Particle cloning and clustering strategies are also employed. 53 These strategies add and remove stochastic particles, 
respectively, in areas of the flow where the particle number density exceeds predefined lower and upper limits. In 
steady-state RAS/PDF methods, a variable time-stepping algorithm can also be utilized to accelerate the convergence 
of the MC simulation. 62 The consistency of the hybrid numerical approach is evaluated by comparing the ensemble 
averaged means (or variances) of the scalar variables, since these quantities can be solved independently by the 
Eulerian and Lagrangian solvers. 


III. Particle tracking 

One of the key numerical elements of the hybrid FDF solver is an efficient particle tracking algorithm. This is 
because a typical hybrid FDF simulation will require a minimum of about 20 MC particles per computational cell 
for accuracy. 60 Given that a high fidelity LES of a practical high-speed combustor device will currently utilize about 
10-100 million computational cells, the number of MC particles can reach 0.2-2 billion. There are two aspects related 
to efficiency: first, the mathematical formulation of the tracking algorithm itself; second, the efficient, parallelizable, 
and scalable numerical implementation. The tracking algorithms can range from readily parallelizable cell-index 
identification methods applicable only on uniform Cartesian meshes 63,64 to methods utilizing near-neighbor search 
applicable on unstructured meshes. 65 An efficient and robust compromise between these two, that is also applicable 
to complex geometries and both structured and unstructured grids, is a convex polyhedron method developed by Pope 
and described in detail by Subramaniam and Haworth. 66 This method is similar to one used by Ansari et ah, 65 however 
it does not require the near-neighbor search because the particles are always partially advected to the cell faces first. 
Since the cell faces contain information about their neighbors the advection can continue for the remainder of the 
particle time step inside the correctly identified cell without the need for a search. Because near neighbor searches are 
computationally expensive, this search-less algorithm can potentially lower the computational costs associated with 
particle tracking. The approach of Subramaniam and Haworth 66 is also robust because it does not require that the 
particle convective Courant-Friedrichs-Lewy (CFL) number be less then one. Such a requirement is often enforced 
when near-neighbor search algorithms are employed to limit the extent of the search domain. Furthermore, particle 
CFL requirements may be misleading. This is because the particle CFL number is typically defined based on some 
characteristic length, e.g. minimum cell edge length. Independently of such definition, particles can still traverse 
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multiple cells along the cell comers even in the limit of the particle CFL number approaching zero. This could result 
in a numerical “loss-of-tracking” of particles, and more importantly, loss of mass conservation in the Lagrangian 
simulation causing errors that are much larger than those associated with truncation errors of the Lagrangian solver. 
Stopping particles at the cell faces as proposed by Subramaniam and Haworth 66 introduces additional modeling 
benefits in flows with flow discontinuities and in multiphase flows. In the former, since the particles’ advection 
properties (e.g. velocities) are updated at the cell faces, the particle can respond to compressible flow features (such as 
shocks) as soon as they are encountered. In the latter, the mass associated with evaporating fuel from the Lagrangian 
droplet can be accurately assigned to the corresponding Eulerian cells in which evaporation occurred. 

IV. Coupling LES/FDF with RAS/PDF 

Predictions of high Reynolds number, wall-bounded flows remain a challenge for LES because the grid size 
requirements necessary to resolve the turbulent boundary layer are on the order of those required for the corresponding 
DNS . 6 This requirement is less severe in RAS because only the wall-normal direction must be “resolved” for accurate 
near-wall solutions. Furthermore, RAS has benefited from several decades of near-wall modeling efforts which 
resulted in the development of models that allow for the reduction of wall-normal grid spacing without significant 
loss in predictive accuracy . 43 Given the benefits of RAS it seemed natural to develop hybrid LES/RAS methods for 
LES of wall-bounded flows . 67 While there is no reason to expect that the wall-models developed for non-reacting RAS 
will perform well in near-wall turbulent, reacting flows, the hybrid LES/FDF-RAS/PDF method with the wall-normal 
directions resolved should be a good candidate for such simulations. This is because such a hybrid model would allow 
for high fidelity modeling of a flame near and through the turbulent boundary layer. 

The turbulent flame-wall interactions (FWI) are of considerable importance when designing practical combustor 
devices. This is because these interactions have a major impact on both the mean and maximum heat fluxes to the 
wall. The former guides the cooling requirements, and the latter impacts the lifetime of a combustor. The mechanisms 
of FWI are somewhat understood and recent DNS data are available . 68 Two canonical configurations are typically 
considered: First is the head-on-quenching (HOQ), where the flame front travels perpendicularly to the wall and 
impinges on it causing flame quenching leading to maximum heat flux. Second is the side-wall quenching (SWQ), 
where the flame front travels parallel to the wall with the quenching of the flame front edge nearest to the wall. Also, in 
both low and high speed boundary layer flows, the flame propagation inside the boundary layer may cause flashback, 
i.e. upstream propagation of the flame-front through the low velocity boundary layer regions. While, flashback is 
unlikely in high-speed boundary layers with favorable pressure gradients, when the boundary layer is weakened by 
adverse pressure gradient and/or impinging shock, flashback may become possible. Furthermore, if during such shock- 
boundary-layer interactions (SBLI) the shock impinges on a boundary layer containing a premixed fuel-air mixture 
then ignition may occur causing localized peak heat fluxes that are much larger than those due to SBLI alone. 

The ability to model turbulent FWI has the potential to improve predictions of not only internal reacting flows but 
also external hypersonic flows such as flows over hypersonic vehicle bodies and reentry vehicles. Duan and Martin 69 
have shown that modeling turbulence-chemistry interactions inside the boundary layers of dissociating air may be 
important to improve predictions of high enthalpy external flows. The coupling of the hybrid LES/FDF-RAS/PDF 
method could be achieved by utilizing a blending function of the transport parameters. However, further research is 
required to determine the form of such a blending function as the reacting boundary layer is likely to exhibit non- 
universal behavior. 


V. Accelerating chemistry 

A. Flamelet Generated Manifolds for compressible flows 

Simulating detailed chemistry in reacting flows requires significant computational resources because chemical source 
terms are highly non-linear and numerically stiff. Direct integration using a detailed reaction mechanism in LES 
or hybrid LES/RAS is currently computationally intractable for flows typically found within a scramjet combustor 
because the necessary additional computational resources render such simulations too expensive for geometries of 
practical interest. That being said, recent advances by the authors toward massively scalable implementations of 
LES/FDF using detailed chemistry and utilizing petascale platforms show great potential and will be discussed in the 
following section. Nevertheless, in order to make substantial progress toward the simulation of high-speed combusting 
flows, simulations must be performed using less-expensive, but similarly accurate, treatments of the finite-rate 
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chemistry. Several methods have been proposed for reducing the complexity of finite-rate kinetics, including: reducing 
reaction mechanisms using sensitivity analyses, 70 ' 71 dynamic tabulation and look-up of relevant chemical quantities 
using the in-situ adaptive tabulation (IS AT) approach, 72 and pre-tabulation of relevant chemical quantities using a 
flamelet-derived method 36 that combines the advantages of laminar flamelets 73 and the Intrinsic Low Dimensional 
Manifold (ILDM) method. 74 However, these current methods are inadequate for supersonic combustion, in which 
compressibility, pressure, and turbulence effects play a significant role in the combustion physics. 

Flamelet-generated manifolds (FGM) have been used extensively as a means of simulating low-Mach number 
combustion due to their inherent compatibility with varying-complexity reaction mechanisms and drastic reduction 
in simulation runtime. 36 73 In its most-general form, FGM reduces detailed chemistry to a low-dimensional manifold 
in composition space parameterized by a reduced set of controlling variables. Prior to simulation, a set of one- 
dimensional laminar counter-flow flames are solved with varying initial conditions, and a table is constructed by 
parameterizing the flamelet solutions by mixture fraction and a reaction progress variable. During simulation, transport 
equations for the mixture fraction and progress variable are solved, in addition to those for the mass, momentum, and 
energy. At each integration step, the transported mixture fraction and progress variable are used to lookup chemical 
quantities from the table, such as chemical composition data, required to integrate the energy equation forward in time. 

The development of the flamelet equations requires several limiting assumptions, including: incompressibility 
(low Mach number), constant pressure, and steady-state. 36 Thus, the validity of these equations diminishes for 
highly-compressible supersonic combusting flows. Recently, various methods for modifying the FGM approach to 
include pressure, compressibility, and unsteady effects have been proposed in the literature, and several approaches 
are described briefly here. Bekdemir et al. 76 simulated diesel engine combustion using the FGM approach in which 
flamelet solutions were parameterized by mixture fraction, a progress variable, and pressure. Flamelets were computed 
and tabulated for seven pressure levels ranging from the pressure at fuel injection to the point of maximum pressure. A 
pressure-dependency study indicated that a minimum of five logarithmically-distributed pressure levels were necessary 
to adequately represent pressure variation through compression and combustion. However, averaged results from the 
FGM simulations indicated artificially-high pressure rises as a consequence of unrealistic short burn durations. Based 
on these findings, the researchers suggested future work include: better capture of the strain rate dependency of 
ignition and investigation into using both premixed and non-premixed flamelet solutions. Although this work focused 
on subsonic combustion, the notion of adding pressure as an additional dimension to an FGM may prove useful for 
the case of supersonic combustion. 

Rather than parameterize the FGM table by pressure, Emory et al. 77 attempted to include the effects of com- 
pressibility and pressure on the simulation for the HyShot II hydrogen-fueled scramjet experiment by imposing a 
pressure-dependent scaling on the progress variable source term. Rather than use the temperature corresponding to the 
flamelet solutions, the transported total energy was used to derive an analytical formulation for temperature by first 
approximating the total energy according to Eq. 20 and then linearly expanding in temperature the ratio of specific 
heats around the tabular value in Eq. 21. Performing the integration analytically in Eq. 20 and rearranging yielded 
the closed-form approximation for temperature included in Eq. 22. Without this linearization, a Newton-Raphson 
method would be required to solve for the temperature. The effect of pressure on the combustion is approximated 
using a scaling on the progress variable source term. Since the combustion of hydrogen is governed primarily by 
bimolecular reactions, the researchers approximated the progress variable source term according to Eq. 23, in which 
the source term computed at a reference pressure is scaled by the square of the normalized pressure to yield a pressure- 
dependency. Simulations at various equivalence ratios for the HyShot II experiment indicated qualitative agreement 
with experimental data. 
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Lacaze and Oefelein 78 performed a detailed study on the use of flamelets for modeling non-premixed combustion 
at supercritical pressures for simulations of rocket engines, where pressures exceed 50 atm. Detailed opposed-jet 
hydrogen-air flame calculations found that: 1) the flame was thin and robust over a wide range of strain rates, 2) 
the flame structure in mixture fraction space was approximately independent of pressure, 3) within the reaction zone, 
temperature had little effect on flame structure, and 4) the heat release was sensitive to pressure and strain fluctuations. 
Based on these findings, the researchers devised an FGM approach for compressible reacting flow simulations within 
the supercritical pressure regime based on coupling the total non-chemical energy transport equation and table through 
a modified chemical source term. Pressures experienced during supersonic combustion within a scramjet are not 
supercritical and typically range from 0. 5-5.0 atm with significant flame strain effects. Thus, the results of Lacaze and 
Oefelein 78 maybe of limited use for such flows. 

Though attempts have been made at including compressible and pressure effects in the FGM framework, current 
methods are still ill-equipped to simulate the strongly unsteady, compressible phenomena comprising dual-mode 
scramjet combustion. Within this flow regime, higher-fidelity implementations of FGM that include more robust 
transport equation-table couplings and detailed pressure and compressibility effects will be necessary. To further 
ascertain unsteady, turbulence-driven flow properties, such a high-fidelity FGM approach will likely necessitate an 
LES or hybrid LES/RAS solution framework. 

B. Irregularly Portioned Lagrangian Monte Carlo Finite Difference for scalable reacting simulations with 
finite-rate chemistry 

Petascale computing has been a reality for open research for the past couple of years, and the exascale platforms are 
the current technological trend which are expected to be available by the end of this decade. 79 Being able to take 
advantage of the enormous opportunity of such extreme scale computing platforms is vital to the success of LES and 
its application in industrial practice. 

Design and implementation of scalable parallel algorithms is the key enabler in the petascale arena. However, this 
is not a trivial task, especially in LES of reacting flows where complex chemistry calculations typically dominate the 
computation. The issue is further exacerbated by the dynamic and inhomogeneous nature of the flow. The variation 
in composition affects the level of stiffness of the chemistry equations. These variations cause the computational load 
to vary significantly throughout the solution domain, which negatively impacts the duration of a simulation. This 
phenomena has been observed in reacting computations with detailed or reduced chemistry via direct integration and 
may also be relevant to acceleration techniques discussed in the previous section. The impact of it is made evident in 
Fig. 2 which shows the variation in the computational load at some instant during the simulation of Sandia flame D, 80 
using direct integration. It is observed that the computational requirement is highly non-uniform throughout; virtually 
no time is spent for calculations near the cold jet or the cold surrounding air, and most of the computational load is 
concentrated around the hot pilot. 

This nature of reacting flow renders straightforward parallelization techniques ineffective, limiting scalability at 
only 100s of computing units as will be shown. An effective parallelization must be adaptive and be driven by the 
dynamics of the flow. We have developed and implemented such a methodology for LES/FDF, termed “Irregularly 
Portioned Lagrangian Monte Carlo Finite Difference” (IPLMCFD). This is a dynamic load balancing technique that 
is especially effective for structured mesh configurations, but is also applicable in a straightforward manner for 
unstructured meshes. It allows for efficient LES of reacting flow on thousands of computing units, and in the context 
of FDF, has a potential to scale to hundreds of thousands of computing units by readily being able to take advantage 
of the heterogeneous multicore nature of today’s petascalable systems. 

A popular parallelization strategy in modern CFD is via temporally invariant block decomposition where the 
mesh is partitioned into equally sized boxes, and each box is assigned to a processor (Fig. 3(a). This uniformity is 
relatively easy to implement and yields a minimal communication overhead. But for unsteady and inhomogeneous 
flows, it usually leads to a poor load distribution. Processors with lighter loads must wait (and remain idle) until 
the synchronization at the end of each time step. The local computation time of each processor for this is shown in 
Fig. 3(c). In this example, the idle time is about half of the total time. We show that the load imbalance problem can 
be fully resolved by portioning the domain irregularly and adaptively. In doing so, the Eulerian mesh is represented 
as an undirected graph where particle cells are the vertices of the graph and are weighted by the computational load. 
Each vertex is assigned a computational weight, i.e. a computation-load metric, which is a function of heterogeneous 
and homogeneous computational loads. This weighted graph is then fed into a graph partitioning algorithm 81 which 
subdivides the domain into clusters of particle cells on which the computational load is evenly distributed. Figure 3(d) 
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Figure 2. Instantaneous CPU time in milliseconds spent performing particle integrations (left). Instantaneous filtered temperature (right). 


shows the load distribution corresponding to the balanced irregular decomposition. Here, the idle time is insignificant. 
The transient load re-distribution problem is envisioned to be resolved by recomputing the load metric, repartitioning 
the domain, and intercommunicating the local data of the partitions as the simulation proceeds. The frequency of 
re-distribution is adjusted based on the communication cost and the extent of load imbalance. The issue of local data 
migration required for such a dynamic load balancing without sacrificing scalability has been investigated by many, 
e.g. see Devine et al. 82 . Furthermore, software such as Zoltan 83 can readily deal with this issue. 

Our current implementation of IPLMCFD for LES/FDF employs the message passing interface (MPI) for dis- 
tributed parallelism via inter-partition communication, and can scale up to 1000s of partitions via the adaptive por- 
tioning as described. From Fig. 4, it is evident that the IPLMCFD methodology outperforms uniform decomposition. 
Not shown on the plot are the actual walltimes. In particular, the sequential run for this configuration takes about 
3,000 seconds per iteration with 256 processors, uniform partitioning has 45/250 = 18% scalability with 66.7 
seconds per time step, whereas IPLMCFD soars at 97% scalability bringing down the walltime to 11.4 seconds. In 
other words, a mere load balancing with the same amount of resources provides almost a 6 fold increase in compu- 
tational throughput! By moving into the thousands of processors range an unbalanced decomposition is simply not 
feasible. IPLMCFD scales almost ideally up to 4000 processors, and starts to tail off slightly beyond this limit as the 
effects of communication become more and more pronounced. Nevertheless, for such a moderately sized simulation, 
it performs favorably. It is important to emphasize that in strong scalability analyses, the size of the simulation is 
kept fixed for all processors. We project this tailing-off to surpass 10,000 processors for larger simulations with which 
the data-to-communication ratio is much higher. This and, other weak scaling properties will be discussed in a future 
publication. 84 

Ability to utilize 0(1O 4 ) processors effectively in order to conduct LES of laboratory scale flames within hours, 
instead of months, has been a major step forward. However, ten thousand processors do not qualify as “petascale,” and 
will be rather insufficient given that our objective is to be able to conduct much larger, realistic, and industry scale sim- 
ulations quickly. It is important to emphasize once again that the analysis shown thus far has been for purely message 
passing, single processor-core per partition runs. The next phase of improvements will substantiate from recognizing 

"The mesh size for this analysis is moderate enough to allow for sequential runs; otherwise, in a more realistic geometry and mesh size, memory 
and time constraints would be prohibit a sequential run. 
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Figure 3. Domain topology with (a) the uniform decomposition, and (b) adaptive irregular decomposition. Non-idle CPU times per 
time-step for each rank for subsequent time-steps with (c) uniform decomposition, and (d) adaptive irregular decomposition. 
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Figure 4. Ratio of serial-to-parallel simulation time versus the number of processors demonstrates the strong scaling results for IPLMCFD 
in comparison with those obtained with uniform decomposition. Also shown is the scaling result for single partition per compute node 
(hybrid). These analyses are conducted on Kraken, the supercomputer at the National Institute for Computational Sciences (NICS), 
Knoxville, TN. 
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the trend in today’s supercomputing: petascale computers are becoming predominantly heterogeneous in nature 85 I . 

As discussed in Section II. D., MC simulations typically require order of millions to billions of particles; moreover, 
as indicated in Section V.A., the evaluation of chemistry equations, Eqs. (16), and (17) is computationally the most 
significant portion of per partition workload. With a balanced distribution of particle workload to separate and 
independent partitions, a further local refinement can be made whereby the work for local ensembles of MC particles 
can be split out to individual CPU cores and/or graphical processing units (GPUs), using shared memory parallelism 
(for example, via OpenMP 86 ), and general purpose GPU programming (for example, via NVIDIA/CUDA 87 or 
OpenCL 88 ), respectively. Given thread-safe and GPU based implementations of chemistry and stiff-solver routines, 
such fine grained parallelism can be implemented scalably in a straightforward fashion. This is an active area of 
research, and some example implementations are available in the literature. 89,90 These efforts, in combination with 
the virtually unbounded, distributed, parallel scalability enabled by IPLMCFD, will provide numerical frameworks 
which are able to leverage petascale platforms effectively for high-fidelity and highly reliable LES of industry scale 
applications. 


VI. Conclusions 


The scalar filtered mass density function (SFMDF) model for large eddy simulations (FES) of high speed reacting 
flows has been presented. The key benefit of this model is that the most challenging aspect of modeling the reacting 
flows (both low and high speed) namely the filtered reaction source terms appears in closed form. The current version 
of the model requires a unity Fewis number and introduces spurious production of scalar variance. Both issues have 
been resolved by modifications to this model. A less defensible assumption embedded in the current SFMDF is 
that of constant pressure within the subgrid. This, however, is addressed by the next generation model, the joint 
velocity-scalar-pressure filtered density function (FDF), although an intermediate joint scalar-pressure formulation 
may also be possible. To solve the FDF transport equation, a hybrid Eulerian-Fagrangian solver is utilized. The 
Eulerian solver integrates the continuity, momentum, and energy equations via a finite-difference or finite-volume 
(or other) method on either structured or unstructured grids. The Lagrangian solver integrates a set of stochastic 
differential equations using a Monte Carlo particle method. The key element of the latter is a robust, efficient, and 
parallelizable particle tracking algorithm based on the convex polyhedron method. To model turbulent flame wall 
interactions important to predicting mean and peak values of the wall heat flux, a coupling of FES/FDF with Reynolds 
averaged simulations (RAS) utilizing probability density function (PDF) method was also discussed. This approach 
is based on the hybrid FES/RAS model developed by Baurle et al. 67 The most computationally intensive component 
of any reacting simulation is the integration of finite-rate kinetics. To reduce the numerical costs associated with 
such integrations the flamelet generated manifold (FGM) model was proposed and discussed. Several challenges 
to applying FGM in high speed compressible flows have been identified. The main ones included the variable 
pressure effects, and unique parametrization of pure mixing states in compressible flows. For cases where finite-rate 
kinetics are desired, the computational costs of numerical integration can be significantly reduced by deploying the 
Irregularly Portioned Fagrangian Monte Carlo Finite Difference (IPFMCFD) method. The combination of modeling 
and numerical capabilities discussed in this paper have the potential to overcome the significant computational cost of 
utilizing FDF methods in practical applications. 
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